1,472 research outputs found

    Tracking repeats using significance and transitivity.

    Keywords: transitivity; extreme value distribution. Motivation: Internal repeats in coding sequences correspond to structural and functional units of proteins. Moreover, duplication of fragments of coding sequences is known to be a mechanism to facilitate evolution. Identification of repeats is crucial to shed light on the function and structure of proteins, and explain their evolutionary past. The task is difficult because during the course of evolution many repeats diverged beyond recognition. Results: We introduce a new method TRUST, for ab initio determination of internal repeats in proteins. It provides an improvement in prediction quality as compared to alternative state-of-the-art methods. The increased sensitivity and accuracy of the method is achieved by exploiting the concept of transitivity of alignments. Starting from significant local suboptimal alignments, the application of transitivity allows us to: 1) identify distant repeat homologues for which no alignments were found; 2) gain confidence about consistently well-aligned regions; and 3) recognize and reduce the contribution of nonhomologous repeats. This reassessment step enables us to derive a virtually noise-free profile representing a generalized repeat with high fidelity. We also obtained superior specificity by employing rigid statistical testing for self-sequence and profile-sequence alignments. Assessment was done using a database of repeat annotations based on structural superpositioning. The results show that TRUST is a useful and reliable tool for mining tandem and non-tandem repeats in protein sequence databases, able to predict multiple repeat types with varying intervening segments within a single sequence
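
The transitivity idea described in this abstract can be illustrated with a minimal sketch. This is not the TRUST implementation: it assumes that significant pairwise alignments have already been reduced to pairs of hypothetical segment labels, and it only shows how segments that were never aligned directly (A and C) end up in the same repeat family because both align to B. The function name and the union-find approach are choices made for this illustration.

```python
from collections import defaultdict


def transitive_repeat_groups(aligned_pairs):
    """Group segment labels that are linked directly or transitively.

    aligned_pairs: iterable of (segment_a, segment_b) tuples, each standing
    for one significant pairwise alignment (hypothetical input format).
    Returns a list of sorted groups, ordered by their first member.
    """
    parent = {}

    def find(x):
        # Locate the representative of x's group, shortening paths as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        # Merge the groups containing a and b.
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b in aligned_pairs:
        union(a, b)

    groups = defaultdict(set)
    for segment in list(parent):
        groups[find(segment)].add(segment)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


if __name__ == "__main__":
    # A-B and B-C are aligned; A-C is inferred by transitivity.
    print(transitive_repeat_groups([("A", "B"), ("B", "C"), ("D", "E")]))
```

A real method would also have to weigh alignment significance and positional consistency before accepting an inferred link, which this sketch leaves out.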

    AuberGene - a sensitive genome alignment tool.

    Motivation: The accumulation of genome sequences will only accelerate in the coming years. We aim to use this abundance of data to improve the quality of genomic alignments and devise a method which is capable of detecting regions evolving under weak or no evolutionary constraints. Results: We describe a genome alignment program AuberGene, which explores the idea of transitivity of local alignments. Assessment of the program was done based on a 2 Mbp genomic region containing the CFTR gene of 13 species. In this region, we can identify 53% of human sequence sharing common ancestry with mouse, as compared with 44% found using the usual pairwise alignment. Between human and tetraodon 93 orthologous exons are found, as compared with 77 detected by the pairwise human-tetraodon comparison. AuberGene allows the user to (1) identify distant, previously undetected, conserved orthologous regions such as ORFs or regulatory regions; (2) identify neutrally evolving regions in related species which are often overlooked by other alignment programs; (3) recognize false orthologous genomic regions. The increased sensitivity of the method is not obtained at the cost of reduced specificity. Our results suggest that, over the CFTR region, human shares 10% more sequence with mouse than previously thought (∼50%, instead of 40% found with the pairwise alignment). © 2006 Oxford University Press

    FACIL: Fast and Accurate Genetic Code Inference and Logo

    Motivation: The intensification of DNA sequencing will increasingly unveil uncharacterized species with potential alternative genetic codes. A total of 0.65% of the DNA sequences currently in GenBank encode their proteins with a variant genetic code, and these exceptions occur in many unrelated taxa. Results: We introduce FACIL (Fast and Accurate genetic Code Inference and Logo), a fast and reliable tool to evaluate nucleic acid sequences for their genetic code that detects alternative codes even in species distantly related to known organisms. To illustrate this, we apply FACIL to a set of mitochondrial genomic contigs of Globobulimina pseudospinescens. This foraminifer does not have any sequenced close relative in the databases, yet we infer its alternative genetic code with high confidence values. Results are intuitively visualized in a Genetic Code Logo

    A strategy to incorporate prior knowledge into correlation network cutoff selection

    Correlation networks are frequently used to statistically extract biological interactions between omics markers. Network edge selection is typically based on the statistical significance of the correlation coefficients. This procedure, however, is not guaranteed to capture biological mechanisms. We here propose an alternative approach for network reconstruction: a cutoff selection algorithm that maximizes the overlap of the inferred network with available prior knowledge. We first evaluate the approach on IgG glycomics data, for which the biochemical pathway is known and well-characterized. Importantly, even in the case of incomplete or incorrect prior knowledge, the optimal network is close to the true optimum. We then demonstrate the generalizability of the approach with applications to untargeted metabolomics and transcriptomics data. For the transcriptomics case, we demonstrate that the optimized network is superior to statistical networks in systematically retrieving interactions that were not included in the biological reference used for optimization
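
The cutoff selection idea in this abstract can be sketched as a simple scan over candidate thresholds. This is a minimal illustration and not the authors' algorithm: the abstract does not say how overlap is measured, so the F1 score between inferred and prior edges is used here as a stand-in, and the function name, input format and example values are all hypothetical.

```python
def select_cutoff(correlations, prior_edges, cutoffs):
    """Pick the correlation cutoff whose network best overlaps prior knowledge.

    correlations: dict mapping (node_a, node_b) -> correlation coefficient.
    prior_edges:  iterable of (node_a, node_b) pairs taken as known interactions.
    cutoffs:      iterable of candidate absolute-correlation thresholds.
    Overlap is scored with F1 (an assumption made for this sketch).
    Returns (best_cutoff, best_score); ties keep the earliest cutoff.
    """
    # Edges are undirected, so store each pair as an unordered frozenset.
    prior = {frozenset(edge) for edge in prior_edges}
    best_cutoff, best_score = None, -1.0

    for cutoff in cutoffs:
        inferred = {
            frozenset(pair)
            for pair, r in correlations.items()
            if abs(r) >= cutoff
        }
        true_positives = len(inferred & prior)
        if true_positives == 0:
            score = 0.0
        else:
            precision = true_positives / len(inferred)
            recall = true_positives / len(prior)
            score = 2 * precision * recall / (precision + recall)
        if score > best_score:
            best_cutoff, best_score = cutoff, score

    return best_cutoff, best_score


if __name__ == "__main__":
    corr = {("a", "b"): 0.9, ("b", "c"): 0.7, ("a", "c"): 0.2, ("c", "d"): 0.4}
    known = [("a", "b"), ("c", "b")]
    print(select_cutoff(corr, known, [0.1, 0.3, 0.5, 0.8]))
```

In this toy input, low cutoffs admit edges that are absent from the prior knowledge and the highest cutoff drops a known edge, so the scan settles on the threshold in between.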

    eggNOG 6.0: enabling comparative genomics across 12 535 organisms

    The eggNOG (evolutionary gene genealogy Non-supervised Orthologous Groups) database is a bioinformatics resource providing orthology data and comprehensive functional information for organisms from all domains of life. Here, we present a major update of the database and website (version 6.0), which increases the number of covered organisms to 12 535 reference species, expands functional annotations, and implements new functionality. In total, eggNOG 6.0 provides a hierarchy of over 17M orthologous groups (OGs) computed at 1601 taxonomic levels, spanning 10 756 bacterial, 457 archaeal and 1322 eukaryotic organisms. OGs have been thoroughly annotated using recent knowledge from functional databases, including KEGG, Gene Ontology, UniProtKB, BiGG, CAZy, CARD, PFAM and SMART. eggNOG also offers phylogenetic trees for all OGs, maximising utility and versatility for end users while allowing researchers to investigate the evolutionary history of speciation and duplication events as well as the phylogenetic distribution of functional terms within each OG. Furthermore, the eggNOG 6.0 website contains new functionality to mine orthology and functional data with ease, including the possibility of generating phylogenetic profiles for multiple OGs across species or identifying single-copy OGs at custom taxonomic levels. eggNOG 6.0 is available at http://eggnog6.embl.de

    Spina bifida-predisposing heterozygous mutations in Planar Cell Polarity genes and Zic2 reduce bone mass in young mice

    Fractures are a common comorbidity in children with the neural tube defect (NTD) spina bifida. Mutations in the Wnt/planar cell polarity (PCP) pathway contribute to NTDs in humans and mice, but whether this pathway independently determines bone mass is poorly understood. Here, we first confirmed that core Wnt/PCP components are expressed in osteoblasts and osteoclasts in vitro. In vivo, we performed detailed µCT comparisons of bone structure in tibiae from young male mice heterozygous for NTD-associated mutations versus WT littermates. PCP signalling disruption caused by Vangl2 (Vangl2Lp/+) or Celsr1 (Celsr1Crsh/+) mutations significantly reduced trabecular bone mass and distal tibial cortical thickness. NTD-associated mutations in non-PCP transcription factors were also investigated. Pax3 mutation (Pax3Sp2H/+) had minimal effects on bone mass. Zic2 mutation (Zic2Ku/+) significantly altered the position of the tibia/fibula junction and diminished cortical bone in the proximal tibia. Beyond these genes, we bioinformatically documented the known extent of shared genetic networks between NTDs and bone properties. 46 genes involved in neural tube closure are annotated with bone-related ontologies. These findings document shared genetic networks between spina bifida risk and bone structure, including PCP components and Zic2. Genetic variants which predispose to spina bifida may therefore independently diminish bone mass

    Timed inhibition of CDC7 increases CRISPR-Cas9 mediated templated repair.

    Repair of double strand DNA breaks (DSBs) can result in gene disruption or gene modification via homology directed repair (HDR) from donor DNA. Altering cellular responses to DSBs may rebalance editing outcomes towards HDR and away from other repair outcomes. Here, we utilize a pooled CRISPR screen to define host cell involvement in HDR between a Cas9 DSB and a plasmid double stranded donor DNA (dsDonor). We find that the Fanconi Anemia (FA) pathway is required for dsDonor HDR and that other genes act to repress HDR. Small molecule inhibition of one of these repressors, CDC7, by XL413 and other inhibitors increases the efficiency of HDR by up to 3.5 fold in many contexts, including primary T cells. XL413 stimulates HDR during a reversible slowing of S-phase that is unexplored for Cas9-induced HDR. We anticipate that XL413 and other such rationally developed inhibitors will be useful tools for gene modification

    Multi-level evidence of an allelic hierarchy of USH2A variants in hearing, auditory processing and speech/language outcomes.

    Language development builds upon a complex network of interacting subservient systems. It therefore follows that variations in, and subclinical disruptions of, these systems may have secondary effects on emergent language. In this paper, we consider the relationship between genetic variants, hearing, auditory processing and language development. We employ whole genome sequencing in a discovery family to target association and gene × environment interaction analyses in two large population cohorts: the Avon Longitudinal Study of Parents and Children (ALSPAC) and UK10K. These investigations indicate that USH2A variants are associated with altered low-frequency sound perception which, in turn, increases the risk of developmental language disorder. We further show that Ush2a heterozygote mice have low-level hearing impairments, persistent higher-order acoustic processing deficits and altered vocalizations. These findings provide new insights into the complexity of genetic mechanisms serving language development and disorders and the relationships between developmental auditory and neural systems

    Specific MRI abnormalities reveal severe Perrault syndrome due to CLPP defects

    In establishing a genetic diagnosis in heterogeneous neurological disease, clinical characterization and whole exome sequencing (WES) go hand-in-hand. Clinical data are essential, not only to guide WES variant selection and define the clinical severity of a genetic defect but also to identify other patients with defects in the same gene. In an infant patient with sensorineural hearing loss, psychomotor retardation, and epilepsy, WES resulted in identification of a novel homozygous CLPP frameshift mutation (c.21delA). Based on the gene defect and clinical symptoms, the diagnosis Perrault syndrome type 3 (PRLTS3) was established. The patient's brain-MRI revealed specific abnormalities of the subcortical and deep cerebral white matter and the middle blade of the corpus callosum, which was used to identify similar patients in the Amsterdam brain-MRI database, containing over 3000 unclassified leukoencephalopathy cases. In three unrelated patients with similar MRI abnormalities the CLPP gene was sequenced, and in two of them novel missense mutations were identified together with a large deletion that covered part of the CLPP gene on the other allele. The severe neurological and MRI abnormalities in these young patients were due to the drastic impact of the CLPP mutations, correlating with the variation in clinical manifestations among previously reported patients. Our data show that similarity in brain-MRI patterns can be used to identify novel PRLTS3 patients, especially during early disease stages, when only part of the disease manifestations are present. This seems especially applicable to the severely affected cases in which CLPP function is drastically affected and MRI abnormalities are pronounced